Abstract

We present a theory which describes a recently introduced model of an evolving,
adaptive system in which agents compete to be in the minority. The
agents themselves are able to evolve their strategies over time in an attempt
to improve their performance. The present theory explicitly demonstrates the
self-interaction, or market impact, that agents in such systems experience.

I. INTRODUCTION



Agent-based models of complex adaptive systems (CAS) provide invaluable insight into 
the highly non-trivial global behaviour of a population of competing agents [1]. These models
typically involve agents with similar capability competing for a limited resource. The agents 
are given the same global information, which is in turn generated by the action of the agents 
themselves, and they learn from past experience. The growing field of econophysics 
represents an area in which such CAS may be applicable: every agent knows the past ups 
and downs in the index of a stock market and must decide how to trade based on this 
global information. An important step forward in agent-based models of CAS was made by 
Challet and Zhang f|]|| who proposed the so-called Minority Game (MG) in which an odd 
number N of agents successively compete to be in the minority. Each agent is randomly 
assigned a limited number of strategies at the beginning of the game, hence introducing some 
quenched disorder. As the game progresses, non-trivial fluctuations arise in the collective 
agents' decisions - these can be understood in terms of the dynamical formation of crowds 
consisting of agents using correlated strategies, and anticrowds consisting of agents using 
the anticorrelated strategies |7]]. Subsequent work by Challet and co-workers has provided 
a remarkable formal connection to spin glass systems ||. 

The basic minority game, however, does not incorporate evolution. Agents are stuck with 
their initial strategies and hence the system cannot avoid this in-built frustration. In the 
real world, one would expect that agents would be able to evolve more successful strategies, 
or at least stop playing disastrous strategies. This motivated us to recently propose a
simpler minority model which allowed for an evolving population [9-11] - we call this the
evolutionary minority game (EMG). D'Hulst and Rodgers [12] subsequently proposed an
analytic theory, based on a slightly modified version of our model. However, the two models
actually give different numerical results [11].

Here we provide a theory for our evolutionary minority game (EMG) [[| which correctly 
includes the self-interaction of the agents. Results are in good agreement with numerical
data. The plan of the paper is as follows. We introduce the EMG in Sec. II and give the
main features observed in numerical simulations of the model. In Sec. III, we present the
formalism and derive the winning probability for an agent. Results from the present theory 
are compared with numerical data in Sec. IV. Section V provides a discussion of the results. 

II. EVOLUTIONARY MINORITY GAME 

Consider an odd number N of agents repeatedly choosing to be in room 0 (e.g. sell) or
room 1 (e.g. buy). After each agent has independently chosen a room, the winners are those
in the minority room. A single binary digit denoting the minority room forms the outcome
for each time-step. Each agent is given the information of the most recent m outcomes. Each
agent also has access to a common register or "memory" containing the outcomes from the
most recent occurrences of all 2^m possible bit strings of length m. Consider, for example,
m = 3 and denote (xyz)w as the m = 3 bit string (xyz) and outcome w. An example memory
would comprise (000)1, (001)0, (010)0, (011)1, (100)0, (101)1, (110)0, (111)1. The entry
(000)1 records that, following the most recent run of three wins for room 0, the winning
room was subsequently 1.
Faced with a given bit string of length m, it seems reasonable for an agent to simply predict
the same outcome as that registered in the memory. The agent will hence choose room 1
following the next 000 sequence. If 0 turns out to be the winning room, the entry (000)1 in
the memory is then updated to be (000)0. Simply put, each agent looks into the most recent
history for the same m-bit pattern and predicts the outcome using the history. In
effect, each agent holds one strategy and all agents hold the same strategy, with the strategy
being dynamical. The strategy is hence to follow the trend. However, if all N agents act in
the same way, they will all lose. A successful agent is one who can follow a trend as long as it
is valid and correctly predict when it will end. To incorporate this factor into our model,
each agent is assigned a single number p, which we refer to as the "gene" value. Following
a given m-bit sequence, p is the probability that the agent will choose the same outcome as
that stored in the memory, i.e. he will follow the current predictor. An agent will reject the
prediction and choose the opposite action with probability 1 - p. To incorporate evolution
into our model, we assign +1 (-1) point to every agent in the minority (majority) room at
each time step. If an agent's score falls below a value d (d < 0), his gene value p is modified.
The new p value is chosen randomly from a range of values centered on the old p with a
width equal to R. We impose reflective boundary conditions to ensure that 0 ≤ p ≤ 1. Our
conclusions do not depend on the particular choice of boundary conditions. For R = 0,
the agents will never change their gene values - this represents the limiting case of in-built
quenched disorder determined by the initial distribution of p values. For any non-zero R
value, the system is able to evolve through gene modification. For R = 2, the new gene
value is uncorrelated with the old one upon modification.
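
The rules described above can be collected into a short simulation. The following Python sketch is only a minimal illustration and is not the code used to produce the figures. The function name `simulate_emg`, the random initial memory and history, the uniform draw within the window of width R, and the reset of an agent's score to zero after a gene modification are all assumptions made here for concreteness.

```python
import random

def simulate_emg(n_agents=101, m=3, d=-4, width=0.2, steps=2000, seed=0):
    """Minimal EMG sketch; returns the list of gene values p after `steps` turns."""
    rng = random.Random(seed)
    genes = [rng.random() for _ in range(n_agents)]
    scores = [0] * n_agents
    # Memory register: last outcome that followed each m-bit history
    # (random initial content is an assumption of this sketch).
    memory = {h: rng.randint(0, 1) for h in range(2 ** m)}
    history = rng.randrange(2 ** m)  # the most recent m outcomes packed into an integer

    for _ in range(steps):
        prediction = memory[history]
        # Each agent follows the prediction with probability p, otherwise does the opposite.
        choices = [prediction if rng.random() < p else 1 - prediction for p in genes]
        attendance_1 = sum(choices)
        minority = 1 if attendance_1 < n_agents / 2 else 0  # N is odd, so no ties
        for i, choice in enumerate(choices):
            scores[i] += 1 if choice == minority else -1
            if scores[i] < d:
                # New p drawn from a window of width R centred on the old p,
                # reflected back into [0, 1] (valid for R <= 2).
                p_new = genes[i] + rng.uniform(-width / 2, width / 2)
                if p_new < 0:
                    p_new = -p_new
                if p_new > 1:
                    p_new = 2 - p_new
                genes[i] = p_new
                scores[i] = 0  # assumption: the score restarts after a modification
        memory[history] = minority
        history = ((history << 1) | minority) % (2 ** m)
    return genes
```

A histogram of the returned gene values, accumulated over a long time window, gives a numerical estimate of P(p).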

Initially, each agent is randomly assigned a gene value in the range 0 ≤ p ≤ 1. Choosing
R ≠ 0 allows the population to evolve. We focus on two quantities, P(p) and L(p), in the
asymptotic limit. Here P(p) is the frequency distribution of gene values, typically taken
in the long time limit over a time window and normalized to unity; L(p) is the lifespan
defined as the average length of time a gene value p survives between modifications. To
introduce the basic features observed in numerical simulations, Fig. 1 shows L(p) and P(p)
(inset) as a function of p for a range of values of m. The other parameters are taken to be
N = 101, R = 0.2 and d = -4. The most interesting feature is that P(p) becomes peaked
around p = 0 and p = 1, with a similar behaviour in L(p). Both of these quantities are
symmetric about p = 1/2. The results are insensitive to the initial distribution of p values.
Surprisingly, the results indicate that agents who either always follow or never follow what
happened last time generally perform better than cautious agents using an intermediate
value of p. Figure 1 also shows that there is no explicit dependence on m for P(p) and L(p)
[11,13]. The independence of the results on m was also discussed recently by Burgos and
Ceva [13] using a random walk argument. Reference [12] proposes a theory which gives a
P(p) somewhat similar to that shown in Fig. 1. However, the theory was developed based on
a model in which each agent is initially assigned one strategy from the strategy pool, and
uses this strategy throughout the game: the corresponding P(p) is then m-dependent [11],
in contrast to the EMG results shown in Fig. 1. The dependences on the other parameters
of the EMG such as N, d, and R are reported in Ref. [11].
III. FORMALISM 

We consider a game with N agents (N >> 1). After a sufficiently long time, the distributions
P(p) and L(p) reach the stationary forms shown in Fig. 1. Consider a certain
moment of the game in this steady-state regime. Let the predictor, which is simply the
strategy stored in the memory for the given history bit-string, be 1; i.e. go to room "1". As
long as the winning room is defined as the minority room, i.e. with a cutoff at (N - 1)/2,
the following arguments do not depend on the actual value of the predictor and hence also
hold if the predictor says 0. We define $F_N(n)$ as the probability of the attendance being n in
the predicted room. It follows from the central limit theorem that $F_N(n)$ will be an
approximately gaussian distribution with mean $N\bar{p}$ and variance $N \int_0^1 P(p)\,p(1-p)\,dp$. Here $\bar{p}$ is
the mean of the gene value p given by $\bar{p} = \int_0^1 p\,P(p)\,dp$, which is known if the distribution
P(p) is known. However, P(p) is the unknown which we are going to solve for. In the steady
state, $F_N(n)$ becomes identical to the probability of the attendance in any one of the two
rooms since the two possible outcomes occur equally often on average. Figure 2 shows the
normalized $F_N(n)$ in the steady state extracted from the numerical simulations.
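
As an illustration of this gaussian form, the following Python sketch tabulates an approximate $F_N(n)$ from a P(p) given on a uniform grid. The function name `gaussian_attendance`, the uniform-grid rule used for the integrals, and the renormalization of the discrete gaussian over n = 0, ..., N are choices made here, not prescriptions of the theory.

```python
import math

def gaussian_attendance(n_agents, gene_grid, gene_density):
    """Gaussian approximation to F_N(n), n = 0..N, from a tabulated P(p).

    gene_grid    : equally spaced p values (assumed uniform spacing)
    gene_density : P(p) on that grid (normalised here if it is not already)
    """
    dp = gene_grid[1] - gene_grid[0]
    norm = sum(gene_density) * dp
    # Mean gene value pbar and the per-agent variance integral of P(p) p (1 - p).
    mean_p = sum(p * w for p, w in zip(gene_grid, gene_density)) * dp / norm
    var_p = sum(p * (1 - p) * w for p, w in zip(gene_grid, gene_density)) * dp / norm
    mean, var = n_agents * mean_p, n_agents * var_p
    weights = [math.exp(-(n - mean) ** 2 / (2 * var)) for n in range(n_agents + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # discrete gaussian, normalised over 0..N
```

For a uniform P(p) the mean attendance is N/2 and the variance is N/6, so for N = 101 the distribution is centred between n = 50 and n = 51 with a standard deviation of about 4.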

In the spirit of self-consistent mean-field theories, the basic idea of the present formulation
is to consider the interaction between a particular agent and the rest of the population. We
present the formulation in a general way so that it can be readily generalized to different
variations of our model. We consider the action of a particular agent, say the k-th player, in
the background of the N - 1 other agents. Let $G^k_{N-1}(n)$ be the probability of the attendance
being n in the predicted room, given that there are only (N - 1) agents participating in the
game (i.e. excluding the k-th agent). Then $F_N(n)$ can be written in terms of $G^k_{N-1}$ as

$$ F_N(n) = p_k\, G^k_{N-1}(n-1) + (1-p_k)\, G^k_{N-1}(n), \tag{1} $$

where $n \neq 0, N$. Here $p_k$ is the p-value of the k-th agent at that moment. The physical
meaning of Eq. (1) is transparent. An attendance of n in room "1" is achieved if the attendance
by the (N - 1) agent background is n - 1 and the k-th agent decides to go to room
"1": this leads to the first term in Eq. (1). Alternatively the attendance by the (N - 1) agent
background is n and the k-th agent decides not to go to room "1": this leads to the second
term in Eq. (1).
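
Eq. (1) can be checked on a small example. The Python sketch below assumes that the agents decide independently with fixed gene values, in which case the attendance distribution is obtained by adding agents one at a time, each addition being one application of Eq. (1). The helper names `add_agent` and `attendance_distribution` are hypothetical.

```python
def add_agent(dist, p):
    """Apply Eq. (1): attendance distribution after adding one agent with gene value p."""
    out = [0.0] * (len(dist) + 1)
    for n, g in enumerate(dist):
        out[n] += (1 - p) * g   # the new agent stays out of the predicted room
        out[n + 1] += p * g     # the new agent joins the predicted room
    return out

def attendance_distribution(genes):
    """Exact attendance distribution for independent agents with the given gene values."""
    dist = [1.0]  # with no agents, the attendance is 0 with certainty
    for p in genes:
        dist = add_agent(dist, p)
    return dist
```

With background gene values 0.2 and 0.7, `attendance_distribution([0.2, 0.7])` plays the role of $G^k_{N-1}$, and `add_agent` applied with $p_k$ = 0.5 gives $F_N$ for N = 3. The end points n = 0 and n = N, which Eq. (1) excludes, come out as $(1-p_k)G^k_{N-1}(0)$ and $p_k G^k_{N-1}(N-1)$.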

Let $r(p_k)$ be the winning probability of the k-th agent. Given the probability $G^k_{N-1}(n)$,
we can write

$$ r(p_k) = p_k \sum_{n=0}^{(N-3)/2} G^k_{N-1}(n) + (1-p_k) \sum_{n=(N+1)/2}^{N-1} G^k_{N-1}(n). \tag{2} $$

Equation (2) says that the k-th agent wins if (i) the attendance in room "1" is at most (N - 3)/2
before he makes his move and he decides to go to room "1", thereby giving the first
term, or (ii) the attendance in room "1" is at least (N + 1)/2 before he makes his move and
he decides not to go to room "1", thereby giving the second term. Since the k-th agent is
only characterized by his gene value $p_k$, $r(p_k)$ can also be interpreted as the success rate of
an agent using gene value $p_k$. It follows from Eq. (1) that

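$$ \begin{aligned} \sum_{n=1}^{(N-3)/2} F_N(n) &= \sum_{n=1}^{(N-3)/2} \left[ p_k \left( G^k_{N-1}(n-1) - G^k_{N-1}(n) \right) + G^k_{N-1}(n) \right] \\ &= \sum_{n=1}^{(N-3)/2} G^k_{N-1}(n) + p_k\, G^k_{N-1}(0) - p_k\, G^k_{N-1}\!\left(\tfrac{N-3}{2}\right). \end{aligned} $$

Since $F_N(0) = (1-p_k)\, G^k_{N-1}(0)$, which follows from the consideration that room "1" is empty
only if the other N - 1 agents do not go to room "1" and the k-th agent does not go to
room "1", we have

$$ \sum_{n=0}^{(N-3)/2} G^k_{N-1}(n) = \sum_{n=0}^{(N-3)/2} F_N(n) + p_k\, G^k_{N-1}\!\left(\tfrac{N-3}{2}\right). \tag{3} $$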

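Similarly we have from Eq. (1)

$$ \begin{aligned} \sum_{n=(N+1)/2}^{N-1} F_N(n) &= \sum_{n=(N+1)/2}^{N-1} \left[ p_k \left( G^k_{N-1}(n-1) - G^k_{N-1}(n) \right) + G^k_{N-1}(n) \right] \\ &= \sum_{n=(N+1)/2}^{N-1} G^k_{N-1}(n) + p_k\, G^k_{N-1}\!\left(\tfrac{N-1}{2}\right) - p_k\, G^k_{N-1}(N-1). \end{aligned} $$

Since $F_N(N) = p_k\, G^k_{N-1}(N-1)$, which follows from the consideration that all the agents go
to room "1" only if all the other N - 1 agents go to room "1" and the k-th agent goes to
room "1", we have

$$ \sum_{n=(N+1)/2}^{N-1} G^k_{N-1}(n) = \sum_{n=(N+1)/2}^{N} F_N(n) - p_k\, G^k_{N-1}\!\left(\tfrac{N-1}{2}\right). \tag{4} $$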



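Substituting Eqs. (3) and (4) into Eq. (2), we obtain

$$ r(p_k) = p_k \sum_{n=0}^{(N-3)/2} F_N(n) + p_k^2\, G^k_{N-1}\!\left(\tfrac{N-3}{2}\right) + (1-p_k) \sum_{n=(N+1)/2}^{N} F_N(n) - p_k (1-p_k)\, G^k_{N-1}\!\left(\tfrac{N-1}{2}\right). $$

Using Eq. (1) to express $G^k_{N-1}\!\left(\tfrac{N-3}{2}\right)$ in terms of $G^k_{N-1}\!\left(\tfrac{N-1}{2}\right)$ and $F_N\!\left(\tfrac{N-1}{2}\right)$, we then obtain

$$ \begin{aligned} r(p_k) &= p_k \sum_{n=0}^{(N-3)/2} F_N(n) + (1-p_k) \sum_{n=(N+1)/2}^{N} F_N(n) + p_k \left[ F_N\!\left(\tfrac{N-1}{2}\right) - 2(1-p_k)\, G^k_{N-1}\!\left(\tfrac{N-1}{2}\right) \right] \\ &= p_k \sum_{n=0}^{(N-1)/2} F_N(n) + (1-p_k) \sum_{n=(N+1)/2}^{N} F_N(n) - 2 p_k (1-p_k)\, G^k_{N-1}\!\left(\tfrac{N-1}{2}\right). \end{aligned} \tag{5} $$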

Equation (5) separates $r(p_k)$ into three terms, each of which has a physically transparent
interpretation. Consider an "outsider", i.e. someone whose action does not affect the outcome
but who instead only bets on which side is the winning room according to the probability $p_k$.
His winning probability is given by the first two terms in Eq. (5). The third term gives the
difference in the winning probability between an "outsider" of the game and an agent who
actually participates in the game. This term is negative, reflecting the fact that an agent
has a smaller probability of winning when he is actually participating in the game. Consider
the case in which the background population is split evenly between room "0" and room
"1": the k-th agent loses no matter what action he takes. Thus the third term represents
this self-interaction, or so-called market impact in financial market terminology. The
$p_k(1-p_k)$ factor means that the winning probability increases as the gene value $p_k$ deviates
more from the value 1/2, and it produces a symmetry about p = 1/2 in L(p) and P(p) as
shown in Fig. 1. Note that Eq. (5) also applies to the case when the predictor says 0: hence
it is independent of the dynamics of the predictor which in turn is determined by the time
evolution of the outcomes. This further implies that the resulting P(p) and L(p) do not
depend on the value of m in the model. For the present EMG, there is no a priori
preferred room: therefore the outcomes 0 and 1 will occur similar numbers of times on
average. In this case, the summations in the first and second terms of Eq. (5) in the steady
state each yield the value 1/2 and hence $r(p_k)$ becomes

$$ r(p_k) = \frac{1}{2} - 2 p_k (1-p_k)\, G^k_{N-1}\!\left(\tfrac{N-1}{2}\right). \tag{6} $$
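
The equivalence of Eq. (2) and Eq. (5) can be verified numerically on a small population. The Python sketch below assumes independent agents with fixed gene values, so that the background distribution is an exact convolution. The function names are hypothetical, and the list `G` holds $G^k_{N-1}(n)$ for n = 0, ..., N - 1.

```python
def add_agent(dist, p):
    """Eq. (1): attendance distribution after adding one agent with gene value p."""
    out = [0.0] * (len(dist) + 1)
    for n, g in enumerate(dist):
        out[n] += (1 - p) * g
        out[n + 1] += p * g
    return out

def win_probability_direct(p_k, G):
    """Eq. (2): winning probability from the (N - 1)-agent background distribution G."""
    N = len(G)  # G has entries n = 0..N-1, so its length equals N (odd)
    low = sum(G[n] for n in range(0, (N - 3) // 2 + 1))
    high = sum(G[n] for n in range((N + 1) // 2, N))
    return p_k * low + (1 - p_k) * high

def win_probability_split(p_k, G):
    """Eq. (5): 'outsider' terms built from F_N minus the self-interaction term."""
    N = len(G)
    F = add_agent(G, p_k)  # F_N(n) for n = 0..N
    outsider = (p_k * sum(F[n] for n in range(0, (N - 1) // 2 + 1))
                + (1 - p_k) * sum(F[n] for n in range((N + 1) // 2, N + 1)))
    market_impact = 2 * p_k * (1 - p_k) * G[(N - 1) // 2]
    return outsider - market_impact
```

For any background and any $p_k$ the two functions should agree to rounding error, and the quantity `market_impact` is largest at $p_k$ = 1/2 and vanishes at $p_k$ = 0 and 1.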

In order to express the right hand side of Eq. (5) entirely in terms of the function F, we
use Eq. (1) to find $G^k_{N-1}\!\left(\tfrac{N-1}{2}\right)$. From Eq. (1), we have

$$ p_k\, G^k_{N-1}(n-2) + (1-p_k)\, G^k_{N-1}(n-1) = F_N(n-1). \tag{7} $$

Subtracting the equations obtained by multiplying Eq. (1) by $(1-p_k)$ and multiplying Eq. (7)
by $p_k$, we can eliminate $G^k_{N-1}(n-1)$ to obtain

$$ (1-p_k)\, F_N(n) - p_k\, F_N(n-1) = (1-p_k)^2\, G^k_{N-1}(n) - p_k^2\, G^k_{N-1}(n-2). $$

Repeatedly applying Eq. (1), we can eliminate $G^k_{N-1}(n-2)$, $G^k_{N-1}(n-3)$, ... to obtain

$$ \sum_{j=0}^{n} (-1)^{n-j}\, F_N(j) \left( \frac{p_k}{1-p_k} \right)^{n-j} = (1-p_k)\, G^k_{N-1}(n). \tag{8} $$

Similarly if we apply Eq. (1) with increasing values of n instead of decreasing values of n,
we obtain

$$ \sum_{j=n+1}^{N} (-1)^{j-n-1}\, F_N(j) \left( \frac{1-p_k}{p_k} \right)^{j-n-1} = p_k\, G^k_{N-1}(n). \tag{9} $$

Although the results are exact, in practice it makes sense to use Eq. (8) for small $p_k$ and
Eq. (9) for $p_k$ close to 1. Using Eq. (8) or Eq. (9) for n = (N - 1)/2 and substituting the result into
Eq. (5), we obtain $r(p_k)$ entirely in terms of $F_N(n)$, and the label k becomes irrelevant. As
mentioned, $r(p_k)$ can be regarded as the winning probability of an agent who is using a gene
value p, and henceforth we denote it by r(p) for simplicity.
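
The two inversion formulas can be written as short functions. The Python sketch below uses hypothetical names, with `F` a list holding $F_N(n)$ for n = 0, ..., N. The helper `add_agent` is included only to build a test case in which $F_N$ is an exact convolution, so that both formulas should return the same $G^k_{N-1}(n)$.

```python
def add_agent(dist, p):
    """Eq. (1): attendance distribution after adding one agent with gene value p."""
    out = [0.0] * (len(dist) + 1)
    for n, g in enumerate(dist):
        out[n] += (1 - p) * g
        out[n + 1] += p * g
    return out

def background_from_below(F, p_k, n):
    """Eq. (8): G_{N-1}(n) from F_N(0..n); requires p_k < 1."""
    ratio = p_k / (1 - p_k)
    total = sum((-ratio) ** (n - j) * F[j] for j in range(n + 1))
    return total / (1 - p_k)

def background_from_above(F, p_k, n):
    """Eq. (9): G_{N-1}(n) from F_N(n+1..N); requires p_k > 0."""
    ratio = (1 - p_k) / p_k
    N = len(F) - 1
    total = sum((-ratio) ** (j - n - 1) * F[j] for j in range(n + 1, N + 1))
    return total / p_k
```

The choice between the two forms matters numerically: the alternating sum in Eq. (8) has terms that decay when $p_k$ < 1/2 and grow when $p_k$ > 1/2, and the reverse holds for Eq. (9), which is one way to understand the recommendation above.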
IV. RESULTS



In order to obtain P(p) from r(p), we note that these two quantities are related. In Ref.
[12], it was pointed out that the stationary distributions P(p) and L(p) are proportional to
each other:

$$ \frac{P(p)}{L(p)} = \text{constant}, \tag{10} $$

where the right hand side is a constant independent of p. Equation (10) follows from the
balance between the fluxes of agents into and out of a region in p-space in the steady state.
Since an agent using the gene value p loses (1 - 2r(p)) points each turn on average [12], the lifespan
L(p) is given by

$$ L(p) = \frac{|d|}{1 - 2r(p)}. $$

From Eq. (10), we have

$$ P(p) \propto \frac{1}{1 - 2r(p)}, \tag{11} $$

with the proportionality constant determined by the normalization of P(p) to $\int_0^1 P(p)\,dp = 1$.

Based on the present theory, it is straightforward to construct an iterative calculation
scheme for P(p). The steps are the following: (a) assume a form for P(p), (b) obtain $F_N(n)$
by evaluating $\bar{p}$ and the standard deviation from the assumed P(p), (c) use Eq. (5) together
with Eqs. (8) and (9) to obtain r(p), (d) calculate P(p) from r(p) using Eq. (11) and the
normalization condition, (e) check for convergence of P(p) and, if necessary, repeat the
steps until convergence is obtained. Note that Eq. (5) is employed since it is valid for all
forms of initial guess for P(p), including those which are non-symmetrical about p = 1/2.
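
Steps (a) to (e) can be sketched in Python as follows. This is an illustrative implementation under stated assumptions, not the authors' code. The function name, the midpoint grid in p, the uniform initial guess, the clamp on 1 - 2r(p), and the explicit symmetrization of P(p) about p = 1/2 after each update are all choices made here. The symmetrization keeps the sketch numerically stable and restricts it to symmetric solutions, whereas the scheme described above also allows non-symmetric guesses.

```python
import math

def iterate_gene_distribution(n_agents=101, grid_size=50, tol=1e-10, max_iter=200):
    """Sketch of steps (a)-(e); returns (grid, P, r) on a midpoint grid in p."""
    N, K = n_agents, grid_size
    dp = 1.0 / K
    grid = [(i + 0.5) * dp for i in range(K)]
    P = [1.0] * K                       # (a) initial guess: uniform P(p)
    r = [0.5] * K
    mid = (N - 1) // 2
    for _ in range(max_iter):
        # (b) gaussian F_N(n): mean N*pbar, variance N * integral of P(p) p (1 - p)
        pbar = sum(p * w for p, w in zip(grid, P)) * dp
        var = N * sum(p * (1 - p) * w for p, w in zip(grid, P)) * dp
        F = [math.exp(-(n - N * pbar) ** 2 / (2 * var)) for n in range(N + 1)]
        norm = sum(F)
        F = [f / norm for f in F]
        low, high = sum(F[: mid + 1]), sum(F[mid + 1:])
        # (c) r(p) from Eq. (5), with G((N-1)/2) from Eq. (8) (p <= 1/2) or Eq. (9)
        for i, p in enumerate(grid):
            if p <= 0.5:
                x = p / (1 - p)
                G = sum((-x) ** (mid - j) * F[j] for j in range(mid + 1)) / (1 - p)
            else:
                x = (1 - p) / p
                G = sum((-x) ** (j - mid - 1) * F[j] for j in range(mid + 1, N + 1)) / p
            r[i] = p * low + (1 - p) * high - 2 * p * (1 - p) * G
        # (d) P(p) proportional to 1 / (1 - 2 r(p)), then normalise to unit integral
        new_P = [1.0 / max(1 - 2 * ri, 1e-12) for ri in r]  # clamp: guard for this sketch
        new_P = [(a + b) / 2 for a, b in zip(new_P, reversed(new_P))]  # assumed symmetry
        total = sum(new_P) * dp
        new_P = [w / total for w in new_P]
        # (e) convergence check on the largest pointwise change
        change = max(abs(a - b) for a, b in zip(new_P, P))
        P = new_P
        if change < tol:
            break
    return grid, P, r
```

The midpoint grid avoids evaluating Eq. (11) exactly at p = 0 and p = 1, where r(p) approaches 1/2 and the right hand side diverges, so the height of the end peaks in this sketch depends on the grid spacing.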

Results for P(p) and L(p) obtained by carrying out the calculation scheme are shown
in Figs. 3 and 4 together with results of numerical simulation for N = 51 and N = 101.
Note that P(p), when properly normalized, is not sensitive to N, while L(p) depends on
N. Results from our theory are in good agreement with numerical data. The results for
P(p) as obtained in Ref. [12] are also shown in Fig. 3 for comparison: note that the results
of Ref. [12] show a plateau over a significant range of p in contrast to the present theory
and the numerical simulations. The comparison indicates that the results from the present
theory are in better agreement with the numerical results. To further test the validity of our
theory, we compare results for r(p) as a function of p with numerical data for N = 51, 101,
and 201 in Fig. 5. The numerical data are found by simply counting the number of times
an agent with gene value p wins. It should be noted that r(p) provides a better test than
P(p) for the validity of any theory, since many forms of r(p) can give rise to similar forms
for P(p). In contrast to the numerical results and those of the present theory shown in
Fig. 5, the expression for r(p) given in Ref. [12] gives a very small r(p) for a significant
range of p around p = 1/2 corresponding to the plateau in P(p). Figure 5 suggests that the
correct r(p) in the steady state, which follows from Eq. (5) (see also Eq. (6)), has the form
$r(p) \sim 1/2 - A(N)\,p(1-p)$, where A(N) is an N-dependent constant which decreases with
N as $1/\sqrt{N}$. Such a scaling with N makes sense from random walk arguments.
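
One way to see this scaling, under the gaussian approximation of Sec. III, is to compare the form above with Eq. (6): this identifies A(N) with twice the background probability at n = (N - 1)/2, which for a gaussian of variance proportional to N is close to its peak height. The Python sketch below encodes that estimate. The function name and the argument `var_per_agent`, standing for the integral of P(p) p (1 - p) over p, are introduced here for illustration, and the difference between N and N - 1 agents is neglected.

```python
import math

def impact_amplitude(n_agents, var_per_agent):
    """Gaussian estimate of A(N) = 2 G((N-1)/2), taking G to be peaked at (N-1)/2."""
    variance = n_agents * var_per_agent  # attendance variance, N - 1 approximated by N
    return 2.0 / math.sqrt(2.0 * math.pi * variance)
```

In this estimate, quadrupling N halves A(N) at fixed `var_per_agent`, consistent with the $1/\sqrt{N}$ behaviour.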

V. DISCUSSION 

We have presented a theory of the EMG based on the consideration of a particular 
agent in the environment formed by the rest of the population. The winning probability 
r(p) is given in terms of the population distribution in one of the rooms. By relating the 
population distribution, the winning probability and the lifespan, an iteration scheme is set 
up for calculating the frequency distribution of gene values P(p). Results for P(p), L(p) and
r(p) are in good agreement with numerical data. 

The present formalism can be used to describe different versions of the EMG. For example,
a generalization of the EMG was recently introduced where the winning 'room' (i.e.
winning decision) was assigned according to whether the attendance was lower than a certain
cutoff |TJ|]. For this case, one can modify the limits in the summations in Eq. (2) and
carry out the calculations accordingly. We emphasize that Eq. (5) is applicable even if the
steady state P(p) is not symmetric about p = 1/2. An interesting feature in this generalized
EMG model is that when the cutoff percentage deviates significantly from 1/2 and becomes
smaller (or larger) than a critical value, the steady state P(p) takes on a form which depends
on the initial distribution of p. In particular, the population distribution P(p) freezes - no
further modification of gene values arises as time evolves for large (or small) enough values of
the cutoff. This phenomenon is discussed in more detail in Ref. []nj. Another generalization
is to modify the way in which the p-value is updated |y|. Future work will focus on
application of the present theoretical approach to such generalizations of the simple minority
game set-up.
FIGURES

FIG. 1. The lifespan L(p), which is the average duration between modifications for a gene value
p, as a function of gene value p for m = 1, 2, ..., 8. The inset shows the distribution of gene values
P(p) as a function of p for different values of m. Both L(p) and P(p) are insensitive to m. The
other parameters are N = 101, d = -4 and R = 0.2.

FIG. 2. The probability of the attendance in one of the two rooms in the steady state, which
is identical to $F_N(n)$, obtained by numerical simulations. The parameters are N = 101, m = 3,
d = -4 and R = 0.2. It is approximately a gaussian distribution as expected from the central limit
theorem.

FIG. 3. The frequency distribution of the gene values p as a function of p for N = 101 and
N = 51 (inset). The other parameters are d = -4 and R = 0.2. The dotted lines are the data
from numerical simulation. The solid lines give the results of the present theory. The dashed lines
give the results of the theory proposed in Ref. [12].

FIG. 4. The lifespan L(p) as a function of p for N = 101 and N = 51 (inset). The dotted
lines are the data from numerical simulation. The solid lines give the results of the present theory.
Other parameters are the same as those in Fig. 3.

FIG. 5. The winning probability r(p) as a function of p for different values of N. The solid lines
give the results of the present theory while the dotted and dashed lines are results from numerical
simulations. The three sets of lines from top to bottom at p = 1/2 correspond to N = 201, 101,
and 51, respectively. Other parameters are the same as those in Fig. 3.